1-10 (Canceled)



11. (Previously amended) A machine executable program embodied on a computer-readable
medium for use in a voice query recognition system that is distributed
across a client system and a separate server system, the program including 
computer-executable instructions comprising: 

a first audio signal receiving routine for receiving user speech utterance signals 
representing speech utterances to be recognized during a sequence of speech 
utterance evaluation time frames, said speech utterances including sentences 
comprised of one or more words; and 

a first signal processing routine adapted to generate representative speech data 
values for each speech utterance evaluation time frame during which speech utterance 
signals are received, said representative speech data values including a set of 
compressed mel-frequency cepstral coefficients (MFCC); 

a formatting routine for rendering said representative speech data values into a 
transmission format suitable for transmission from the client system over a 
communications channel to a second processing routine executing on the server 
computing system; and 

wherein said representative speech data values are transmitted continuously 
during said speech utterances within streaming packets and without waiting for silence 
to be detected and/or said speech utterances to be completed; 

further wherein said representative speech data values constitute a minimum 
amount of information that can be used by said second processing routine to complete 
accurate recognition of said one or more words and said sentences; 

and said communications channel further being configured by the machine 
executable program such that grammar related information is sent to the second 
processing routine executing on the server computing system for identifying a grammar 
to be used for recognition of said one or more words and said sentences. 
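
For illustration only, the streaming behavior recited in claim 11 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the packet layout, the `quantize` step standing in for coefficient compression, and the names `stream_packets`, `extract`, and `grammar_id` are all hypothetical and do not appear in the claims. The sketch shows one packet being emitted per evaluation time frame as soon as that frame's values exist, with a grammar identifier sent first, rather than buffering until silence or the end of the utterance.

```python
import struct
from typing import Callable, Iterable, Iterator, List


def quantize(coeffs: List[float], scale: float = 8.0) -> bytes:
    """Hypothetical compression: scale each coefficient and clamp it to a signed byte."""
    out = bytearray()
    for c in coeffs:
        q = max(-128, min(127, round(c * scale)))
        out += struct.pack("b", q)
    return bytes(out)


def stream_packets(
    frames: Iterable[List[float]],
    extract: Callable[[List[float]], List[float]],
    grammar_id: str,
) -> Iterator[bytes]:
    """Yield one packet per frame as soon as its coefficients are available.

    `extract` is a caller-supplied feature extractor (an assumption; a real
    client would compute MFCCs here). Nothing waits for end of utterance.
    """
    # Hypothetical layout: a 'G' packet names the grammar the server should load.
    yield b"G" + grammar_id.encode("utf-8")
    for seq, frame in enumerate(frames):
        payload = quantize(extract(frame))
        # 'F' packet: 4-byte sequence number, 2-byte payload length, then payload.
        yield b"F" + struct.pack("!IH", seq, len(payload)) + payload


if __name__ == "__main__":
    # Toy extractor: 12 copies of the frame mean, standing in for 12 coefficients.
    toy = lambda frame: [sum(frame) / len(frame)] * 12
    for packet in stream_packets([[0.0] * 4, [1.0] * 4], toy, "city_names"):
        print(len(packet), packet[:1])
```

Because `stream_packets` is a generator, each packet can be handed to the transport layer while later frames are still being captured.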

12. (Original) The program of claim 11, wherein said program works within a browser
program executing on said computing system as part of a client-server based system. 

13. (Previously amended) The program of claim 11, wherein said set of compressed
MFCCs is generated at a rate corresponding to at least 100 frames per second, and 
such that said set of compressed MFCCs includes a separate cepstral coefficient value 
for a corresponding frequency component of said user speech utterance signals, and 
said first data content corresponds to a set of said frequency components spanning an 
audible speech frequency range. 

14. (Previously amended) The program of claim 11, wherein additional data content
including a set of delta and acceleration coefficients is computed from said set of 
compressed MFCCs at either said client system or said server computing system on a 
connection by connection basis based on an evaluation of computing resources 
available at such client system and said server computing system. 
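
For illustration only, the delta and acceleration coefficients referred to in claim 14 can be sketched with the commonly used regression formula over neighboring frames. This is a minimal sketch under stated assumptions: the claims do not specify a formula, so the window size `n`, the edge handling, and the names `deltas` and `choose_delta_site` (including its simple score comparison) are hypothetical.

```python
from typing import List, Sequence


def deltas(feats: Sequence[Sequence[float]], n: int = 2) -> List[List[float]]:
    """Regression-based delta coefficients over a +/- n frame window.

    d[t] = sum_{k=1..n} k * (c[t+k] - c[t-k]) / (2 * sum_{k=1..n} k^2),
    with frames beyond either end replaced by the nearest edge frame.
    """
    denom = 2 * sum(k * k for k in range(1, n + 1))
    last = len(feats) - 1
    out: List[List[float]] = []
    for t in range(len(feats)):
        row = []
        for d in range(len(feats[t])):
            num = 0.0
            for k in range(1, n + 1):
                ahead = feats[min(t + k, last)][d]
                behind = feats[max(t - k, 0)][d]
                num += k * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out


def choose_delta_site(client_score: float, server_score: float) -> str:
    """Hypothetical per-connection decision: compute where more resources are free."""
    return "client" if client_score >= server_score else "server"


if __name__ == "__main__":
    cepstra = [[0.0], [1.0], [2.0], [3.0], [4.0]]  # one coefficient per frame
    delta = deltas(cepstra)
    accel = deltas(delta)  # acceleration is the delta of the deltas
    print(choose_delta_site(0.2, 0.9), delta[2], accel[2])
```

Acceleration coefficients are obtained by applying the same function to the delta sequence, so either side of the connection can run both steps once the compressed MFCCs are available.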

15. (Previously Amended) A machine executable program embodied on a computer-readable
medium for use in a voice query recognition system that is distributed
across a client system and a separate server system, the program including 
computer-executable instructions comprising: 

a first audio signal receiving routine for receiving user speech utterance signals 
representing speech utterances to be recognized during a sequence of speech 
utterance evaluation time frames, said speech utterances including sentences 
comprised of one or more words; and 

a first signal processing routine adapted to generate representative speech data 
values for each speech utterance evaluation time frame during which speech utterance 
signals are received, said representative speech data values including a set of 
compressed mel-frequency cepstral coefficients (MFCC); 

a formatting routine for rendering said representative speech data values into a 
transmission format suitable for transmission from the client system over a 
communications channel to a second processing routine executing on the server 
computing system; and 

wherein said representative speech data values are transmitted continuously 
during said speech utterances within streaming packets and without waiting for silence 
to be detected and/or said speech utterances to be completed; 

further wherein said representative speech data values constitute a minimum 
amount of information that can be used by said second processing routine to complete 
accurate recognition of said one or more words and said sentences; 

and said communications channel further being configured by the machine 
executable program such that grammar related information is sent to the second 
processing routine executing on the server computing system for identifying a grammar 
to be used for recognition of said one or more words and said sentences; 

wherein said second processing routine is configured with an amount of 
resources by said server computing system based on a bandwidth and transmission 
speed associated with a transmission link between said server computing system and 
said client system so that said second processing routine performs accurate recognition 
of said one or more words with a first latency that is less than a second latency that 
would result if said one or more words were recognized by said first signal processing
routine and then transmitted over said transmission link. 



16-45 (Canceled) 
46. (Canceled) 
47 - 52. (Canceled) 

53. (Previously amended) A method of performing distributed voice recognition
comprising the steps of: 

(a) receiving user speech utterance signals representing speech utterances to be 
recognized during a sequence of speech utterance evaluation time frames, said 
speech utterances including sentences comprised of one or more words; and 

(b) generating representative speech data values with a first processing circuit for 
each speech utterance evaluation time frame during which speech utterance 
signals are received, said representative speech data values including a set of 
compressed mel-frequency cepstral coefficients (MFCC); 

(c) encoding said representative speech data values into a transmission format 
suitable for transmission over a communications channel to a second processing 
circuit; and 

further wherein said representative speech data values constitute a minimum 
amount of information that can be used by said second processing circuit to 
complete accurate recognition of said one or more words and said sentences; 
(d) communicating grammar related information over said communications
channel to the second processing routine to specify a grammar to be used for
recognition of said one or more words and said sentences.

54. (Original) The method of claim 53, wherein said recognition of said one or more 
words occurs in real-time. 

55. (Previously amended) The method of claim 53, wherein said set of compressed 
MFCCs is generated at a rate corresponding to at least 100 frames per second, and 
such that said set of compressed MFCCs includes a separate cepstral coefficient value 
for a corresponding frequency component of said user speech utterance signals, and 
said first data content corresponds to a set of said frequency components spanning an 
audible speech frequency range. 

56. (Previously amended) The method of claim 55, wherein a set of delta and
acceleration coefficients are computed from said cepstral coefficient values to complete 
recognition of said one or more words and said sentences, wherein such set of delta 
and acceleration coefficients are computed at either said first processing circuit or said 
second processing circuit on a connection by connection basis based on an evaluation 
of computing resources available at such respective processing circuits. 

57. (Previously Amended) A method of performing distributed voice recognition
comprising the steps of: 

(a) receiving user speech utterance signals representing speech utterances to be 
recognized during a sequence of speech utterance evaluation time frames, said 
speech utterances including sentences comprised of one or more words; and 

(b) generating representative speech data values with a first processing circuit for 
each speech utterance evaluation time frame during which speech utterance 
signals are received, said representative speech data values including a set of 
compressed mel-frequency cepstral coefficients (MFCC); 

(c) encoding said representative speech data values into a transmission format 
suitable for transmission over a communications channel to a second processing 
circuit; and 

further wherein said representative speech data values constitute a minimum 
amount of information that can be used by said second processing circuit to 
complete accurate recognition of said one or more words and said sentences; 
(d) communicating grammar related information over said communications 
channel to the second processing routine to specify a grammar to be used for 
recognition of said one or more words and said sentences; 

wherein said second processing circuit is configured with an amount of resources 
by a server computing system based on a bandwidth and transmission speed 
associated with a transmission link between said server computing system and a client 
system associated with the first processing circuit, so that said second processing 
circuit performs accurate recognition of said one or more words with a first latency that 
is less than a second latency that would result if said one or more words were 
recognized by said first processing circuit and then transmitted over said transmission 
link. 

58 - 70 (Canceled)

71. (Previously submitted) The machine executable program of claim 11, further
including a query/answer routine adapted to transmit a speech based query over 
the communications channel in response to a button being pressed on the client 
system. 

72. (Previously submitted) The machine executable program of claim 71, wherein
said client system is a portable electronics device and a data content for said 
representative speech data values is configured based on a processing ability of 
such device. 

73. (Previously submitted) The method of claim 53, further including a step: 
transmitting a speech based query over the communications channel in response
to a button being pressed on a client system containing said first processing 
circuit. 

74. (Previously submitted) The machine executable program of claim 73, wherein 
said client system is a portable electronics device and a data content for said 
representative speech data values is configured based on a processing ability of 
such device. 